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Abstract 

In this review, we give an introduction to the structural and func- 
tional properties of the biological networks. We focus on three major 
themes: topology of complex biological networks like the metabolic and 
protein-protein interaction networks, nonlinear dynamics in gene regula- 
tory networks and in particular the design of synthetic genetic networks 
using the concepts and techniques of nonlinear physics and lastly the ef- 
fect of stochasticity on the dynamics. The examples chosen illustrate the 
usefulness of interdisciplinary approaches in the study of biological net- 
works. 



1 Introduction 

Networks are widely prevalent in all spheres of life [pi ^, ||]. A network of ac- 
quaintances is the simplest example one can think of. Social, economic and 
political networks of various kinds are part of human society. The internet, a 
network of information resources, plays a vital role in the gathering, sharing 
and transmission of information. A network consists of nodes connected by 
links. Figure 1 shows the example of a network in which the solid circles denote 
the nodes and the solid lines the links. Some examples of real life networks 
are as follows: in a network describing an electrical power grid, the genera- 
tors, transformers and substations are the nodes and the high-voltage trans- 
mission lines connecting them the links. In the World Wide Web (WWW), 
the documents/pages constitute the nodes. These are connected to other docu- 
ments/pages through links. In a collaboration graph of movie actors, the nodes 
represent the actors. Two actors are connected by a link if they appear in the 
same movie. In a citation network, the nodes are the papers published in refer- 
eed journals. A paper is linked to all the other papers it cites. Cellular processes 
are controlled by various types of biochemical networks. A metabolic network 
[|| controls the processes which generate mass and energy from nutritive matter. 
The nodes in such a network are the substrates such as ATP, ADP and H2O. 
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Two substrates are connected by a link if both of them participate in the same 
biochemical reaction. Traditional cell biology assigns specific functional roles 
to individual proteins, such as catalysts, signalling molecules and constituents 
of cellular matter. In the post-genomic era, there is an increasing emphasis on 
understanding the functions of proteins as parts of an interacting network and 
also on the collective, emergent properties of the network. In a protein-protein 
interaction network 0], the nodes represent the proteins. A link exists between 
two nodes if the corresponding proteins have a direct physical interaction. 

The networks discussed above have complex topology. Spectacular advances 
in computerisation of data acquisition (the Human Genome Project is a prime 
example) have made it possible to construct large databases which contain in- 
formation on the topology of real life networks. The advent of powerful comput- 
ers has given rise to extensive investigations of networks containing millions of 
nodes. The interesting fact emerging out of these studies is that biological net- 
works share common topological features with non-biological networks. There 
appears to be a general blueprint for the large scale organisation of several of 
these networks. In Section 2 of this review, we discuss two types of biological 
networks, namely, the metabolic networks of several organisms and the protein- 
protein interaction networks associated with the yeast S. cerevisiae ^ and the 
human gastric pathogen H. Pylori [jj]. The major topological features of these 
networks are described and the similarity in the design principles of large-scale 
biological and non-biological networks pointed out. 

Gene regulatory networks are the most significant examples of biological 
networks. Gene expression and regulation are the central activities of a living 
cell j8|. Genes are fragments of DNA molecules and determine the structure 
of functional molecules like RNAs and proteins. In each cell, at any instant of 
time, only a subset of genes present is active in directing RNA / protein synthe- 
sis. The gene expression is "on" in such a case. The information present in the 
gene is expressed through the processes of transcription and translation. Dur- 
ing transcription, the sequence along one of the strands of the DNA molecule is 
copied onto a RNA molecule (mRNA ) . The sequence of the mRNA molecule 
is then translated into the sequence of amino acids, which determines the func- 
tional nature of the protein molecule produced. In a gene regulatory network, 
the protein encoded by one gene can regulate the expression of other genes. 
These genes in turn produce new regulatory proteins which control still other 
genes. A protein may also regulate its own level of production through an au- 
toregulatory feedback process. The occurrence of cell differentiation, when an 
organism grows from its embryonic stage, depends upon the selective switching 
on of gene expression in individual cells. All these cells have identical sets of 
genes but follow different developmental pathways depending upon the patterns 
of gene expression in the cells. Thus distinct types of cells such as hair and skin 
cells are obtained. Gene expression is also regulated in metabolism and progres- 
sion through the cell cycle as well as in responses to external signals. Infected 
cells can multiply because the expression of certain genes is "on" in these cells 
whereas in normal cells the expression of the same genes is "off". 

Despite a vast amount of experimental data, the complex dynamical pro- 
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cesses involved in gene regulation are not fully understood as yet. A large 
number of theoretical studies has been undertaken j9) but only a few of these 
make quantitative predictions in agreement with experimental results. Two key 
concepts which emerge out of the theoretical studies are: nonlinearity of the net- 
work dynamics and the role of stochasticity in gene expression and regulation 
[ |l0| . The variables of interest in the network dynamics are the concentrations 
of the mRNAs, proteins and other biomolecules within the cell. The rate of 
change in the concentration of a biomolecular species is a nonlinear function of 
the other variables. The dynamics is governed by a set of coupled non-linear 
differential equations which in most cases are solved numerically. Let U be the 
concentration of, say, a particular type of protein in the cell. The rate of change 
of U is given by 

= (Production - Loss/Decay ) of U per unit time. 
The production term is a nonlinear function of the other concentration variables 
and the loss term is usually proportional to U. In Section 3 of this review, 
we briefly describe the major features of nonlinearity in the dynamics of gene 
networks. There is currently a significant emerging trend to utilise the concepts 
and techniques of nonlinear physics in the actual construction of synthetic gene 
regulatory networks with a variety of applications. In Section 3, a specific 
example of this, namely, the genetic toggle switch [jy] will be given. 

The biochemical rate equations which govern the dynamics of gene regula- 
tory networks are deterministic in nature. Many molecules associated with the 
networks have low intracellular concentrations and consequently fluctuations 
in reaction rates are considerably large. Gene expression involves a series of 
biochemical reactions and due to stochastic fluctuations in the reaction rates, 
proteins are produced in short bursts at random time intervals. In the last few 
years, there is an increasing realization that stochasticity plays a significant role 
in biological processes (l0[ To give an example, consider the situation in 
which two independently produced regulatory proteins A and B are in com- 
petition to control a developmental switch that selects between two pathways 
depending on which protein wins. The protein concentrations have to reach 
effective levels in order to activate the switch. Due to stochastic fluctuations, 
the amounts of proteins A and B produced as a function of time can vary widely 
from cell to cell. In some cells, protein A reaches the effective level first and 
activates the developmental switch along one specific pathway. In the other 
cells, protein B takes control and the other pathway is activated. Thus even a 
clonal cell population exhibits phenotypic variations as the cells follow different 
developmental pathways. Environmental signals can bias the probabilities of 
path choice in a regulatory circuit. Organisms make use of this mechanism to 
increase the probability of survival in a hostile environment. Cells often utilise 
fluctuations (noise) to randomize choices of developmental pathways when such 
randomization is desirable for the survival and growth of the organism. A well- 
known example is that of the phage A lysis-lysogeny network . The bacterial 
E.coli cells, when infected by the virus phage A , can follow two developmental 
pathways: lysis and lysogeny. In the lysogenic state, the infection is dormant. 
Phage A is inert and integrated into the host cell's chromosome. It replicates 
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along with the bacterial DNA and each new cell contains the dormant phage. 
In the lytic state, the infection proliferates. The viral DNA replicates using the 
host cell machinery giving rise to a large number of progeny phage. These in 
turn lyse or burst the host bacterium cell and the infection spreads to more 
cells. Again, due to stochastic fluctuations, the cell population divides into two 
subpopulations: lysogenic and lytic. The selection of a developmental pathway 
after the host cell is infected is not deterministic but probabilistic. In Section 4 
of this review, the effect of stochasticity on the dynamics of the A-phage network 
will be briefly discussed. The network illustrates the competitive control of a 
developmental switch by two regulatory proteins. 

As already mentioned before, gene expression/regulation involves several bio- 
chemical reactions with appreciable stochastic fluctuations. Gillespie |fl4| , |l5| 
has proposed a Monte Carlo simulation algorithm to describe the kinetics of 
coupled stochastic reactions. This method is physically more rigorous than the 
conventional differential equation approach. The inherent assumption in the 
latter method is that the temporal changes in the concentrations of reacting 
molecules are both continuous and deterministic. The assumption is not true if 
the concentrations are small and the reaction rates slow or if the system under- 
goes large, rapid and discrete transitions. In the Gillespie algorithm, changes in 
the numbers of the reacting molecules occur in integral numbers brought about 
by random, distinct reaction events. The Gillespie algorithm is described in de- 
tail in Section 4 and some illustrative examples are given. Recent experiments 
at the level of a single cell have shown that gene expression occurs in abrupt 
stochastic bursts |jl^, [l7], [l^, 19 1. Further, in an ensemble of cells, the levels of 
proteins produced have a bimodal distribution. In a large fraction of cells, the 
gene expression is either off or has a high value. We have proposed a model of 
gene expression the essential features of which are stochasticity and coopera- 
tive binding of RNA polymerase, the molecule responsible for transcription p0| . 
The model can reproduce the bimodality observed in experiments. We include a 
description of the model in Section 4 to give an additional example of the effect 
of stochasticity on gene regulation. Section 5 of the review contains concluding 
remarks. 

The emphasis in this review is on recent studies of biological networks. There 
are a number of exhaustive reviews and books on earlier work. The three major 
themes that several recent studies focus on are: topology of networks, nonlinear 
dynamics and its consequences and the role of stochasticity in biological pro- 
cesses. The present review is meant to be an introduction to these themes and 
to highlight the fact that interdisciplinary approaches are essential to develop 
an integrated understanding of biological networks. 



2 Topology of complex networks 

Many real life networks have a complex structure. The mathematicians Erdos 
and Renyi |H| were the first to propose a model of a complex network known 
as a random graph. One starts with N nodes and connects every pair of nodes 
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with probability p. The graph thus has approximately pJV ^~^ links distributed 
in a random manner. Studies of real life networks, however, reveal that these 
cannot be described as random graphs. This distinction is possible on the basis 
of quantitative measurements of certain topological features which we define 
below. Several complex networks including the random graph are described as 
small world networks [jl], [|, ||, ^2) . The small world idea implies that though the 
networks are large in size (the number of nodes in a network is a measure of its 
size), any pair of nodes can be connected by a short path. The distance between 
two nodes is given by the number of links along the shortest path connecting 
the nodes. In Figure 1, the distance between the nodes A and B is three. The 
diameter of the network, also known as the average path length 1, is the average 
of the distances between all pairs of nodes. The global population is huge but 
still we live in a small world as any random pair of individuals are connected to 
each other through a short path of intermediate acquaintances. This was first 
established by Stanley Milgram |^3| who found out that the average path length 
of intermediate acquaintances is six. In a small world network, the diameter 
scales as the logarithm of the number of nodes. 

The second measurable topological feature of a complex network is its degree 
distribution jl], ^, 0] . The number of links by which a node is connected to the 
other nodes varies from node to node. Let P{k) be the probability that a 
randomly selected node has exactly k links. Equivalently, P{k) is the fraction 
of nodes, on an average, which has exactly k links. One can define an average 
degree (k) of the network, the degree of a node being the number of links 
attached to the node. In a random graph, the links are established randomly 
and most of the nodes have degrees close to (k) . The degree distribution P(k) 
vs. k is Poissonian. It is strongly peaked at k = {k) and has an exponential 
decay for large k, i.e., P(k) ~ e~ k for k 3> (k) and k <C (k) . In many real 
life networks, the degree distribution P(k) has no well-defined peak but has a 
power-law distribution 

P{k)~k-~< (1) 

where the exponent 7 is a numerical constant. Such networks are known as 
scale-free networks because they are not tied to a specific scale. P(k) has a 
finite value over a wide range of k values. The power-law form of the degree 
distribution implies that the networks are extremely inhomogeneous unlike in 
the case of a random graph. In a scale- free network, there are many nodes with 
few links and a few nodes with many links. The highly connected nodes play 
a key role in the functionality of the network. Both the random graph and the 
scale-free networks are small world networks. 

The third topological quantity which is measurable is known as the clustering 
coefficient J|, |2^| . The coefficient is a measure of the tendency of the nodes of the 
network towards clustering. In a social network, the individuals are the nodes 
and two nodes are connected by a link if the individuals are acquainted with 
each other. In such a network, one's friend's friends are also likely to be one's 
friends giving rise to a clustering of acquaintances. The clustering coefficient is 
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defined in the following manner. Let us select a specific node i in the network 
which is connected by fcj links to fcj other nodes. If these first neighbours are 
all part of a cluster, there would be ki< - k ^~ 1 "> links between them. The clustering 
coefficient Cj of node i is given by 

C % = 2El (2) 

kiiki 1 ) 

where Ei is the number of actual links which exist between the k{ nodes. The 
clustering coefficient C of the whole network is obtained by taking an average 
over all the Cj values. The utility of the clustering coefficient is demonstrated in 
the following example. The neural network of the nematode worm C. Elegans is 
small in size [^, The number of neurons which constitute the nodes of the 
network is 282. A link exists between two nodes if the neurons are connected by 
either a synapse or a gap junction. The average degree of the network is (k) =14. 
Now consider a random graph of the same size and average degree. The average 
path lengths for the neural network and the random graph are similar, 2.65 and 
2.25 respectively. Is the neural network then a random graph? The answer is 
no as the clustering coefficient of the former has the value 0.28 which is much 
larger than the value 0.05 in the case of the latter network. Examples of real 
life networks which are scale-free are [|[ |2?|: the collaboration graph of movie 
actors (size N of the network = 212 250 nodes, average degree (k) = 28.78, the 
exponent 7 in Eq.(l) is 7 = 2.3 ), the WWW (N = 325 729, (k) = 5.46, 7 = 
2.1) and the network of citations (N = 783 339 papers, (k) = 8.57, 7 = 3). The 
results are obtained from available databases. A more comprehensive and up to 
date list of networks is given in Ref. 

We now discuss some complex, biological networks. Recently, Jeong et al Q 
have systematically investigated the topological properties of the core metabolic 
networks of 43 different organisms representing all the three domains of life. The 
data on these organisms are available in the WIT (What Is There) database. As 
already mentioned in the Introduction, the nodes of the metabolic network are 
the different substrates. Two substrates are connected by a link if they partic- 
ipate in the same biochemical reaction. The metabolic networks have different 
sizes, the less complex organisms having smaller sizes. There is considerable 
variation in the individual constituents and the pathways of the networks. Yet 
they display identical topological scaling properties which resemble those of 
complex non-biological networks. The metabolic networks have been found to 
belong to the class of scale-free networks. The probability that a substrate par- 
ticipates in k reactions has a power-law distribution. The links in a metabolic 
network are directed as many biochemical reactions are preferentially catalysed 
in one direction. For each node, one has to distinguish between incoming and 
outgoing links. Correspondingly, there are two exponents 7j„ and "f out . The 
exponents turn out to have the same value of 2.2. 

In the metabolic network, the distance between two substrates is given by 
the number of links (reactions) in the shortest biochemical pathway connect- 
ing the two substrates. A surprising result obtained by Jeong et al is that the 
diameter of the metabolic network is the same for all the 43 organisms, i.e., 
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it does not depend upon the number of substrates (nodes) belonging to the 
network. This is counterintuitive and only possible if with increasing organism 
complexity pre-existing individual substrates are increasingly connected in or- 
der to maintain a more or less constant network diameter. In support of this 
conjecture, Jeong et al found that the average number of reactions in which 
a certain substrate participates increases as the number of substrates in the 
organism increases. Conservation of the network diameter may be favourable 
for the survival and growth of an organism. A larger diameter would possibly 
diminish the organism's ability to respond to changes in an efficient manner. 

The scale-free character of the metabolic network implies that a few hubs 
which are highly connected play a dominant role in the functioning of the net- 
work. On sequential removal of these highly-connected nodes, the network di- 
ameter rises sharply and ultimately the network disintegrates into isolated frag- 
ments. On the other hand, the network diameter does not change appreciably 
when the nodes with a few links are removed from the network. Scale-free net- 
works, in general, are robust against random mutations/errors but vulnerable to 
attacks targeted at highly connected nodes. Complex communication networks 
are surprisingly robust, local failures rarely hamper the global transmission of 
information. Organisms can grow and survive in hostile environments due to 
the error tolerance of the underlying metabolic network. A random graph is 
not as robust against random mutations/errors. Mutagenesis studies in-silico 
and in- vivo |26) have established the remarkable error tolerance of the metabolic 
network of E.coli on removing a large number of metabolic enzymes. Jeong et 
al, in their study of the metabolic networks of organisms found that only ~ 4% 
of all the substrates present in all the 43 organisms are present in all the species. 
The striking fact is that the small number of substrates, common to all species, 
turn out to be the most highly connected ones. On the other hand, there are 
species-specific differences in the case of less-connected substrates. 

Jeong et al in a separate study |5| have investigated the protein-protein 
interaction network of the yeast S.cerevisiae. The network has 1870 proteins 
as nodes which are linked by 2240 direct physical interactions identified mostly 
by systematic two-hybrid experiments. Actual measurements show that the 
probability P{k) that a given yeast protein interacts with k other yeast proteins 
has a power-law distribution with an exponential cutoff at k c = 20. 

(k + k ) 

P(k) ~ (k + k )-~<e — (3) 

with k = 1 and 7 = 2.4. The protein-protein interaction network of the bac- 
terium H. Pylori displays similar topology. For the metabolic networks, the 
exponent 7 has the value 2.2. The value of 7 falls in the range 2.0-2.5 for many 
scale-free networks. Like the metabolic network and other scale-free networks, 
the protein-protein interaction network is found to be immune to random muta- 
tions. The removal of highly connected nodes may, however, disrupt the network 
function. The protein product of the p53 tumor-suppressor gene is one of the 
most highly connected proteins found in human cells. Mutations of p53 gene 
affect cellular functions severely from a biomedical point of view. 
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In fact, Jeong et al's study on the protein-protein interaction network in 
yeast shows that proteins with five or lesser number of links constitute ~ 93% 
of the total number of proteins but only ~ 21% of them are essential so that 
the removal of such proteins proves to be lethal. In contrast, only ~ 0.7% of the 
total number of proteins have more than 15 links but single deletion of ~ 62% 
of these severely affects the functioning of the network. It is possible that the 
proteins which constitute the highly connected nodes in a network share common 
structural features. These features favour the binding of many different types 
of proteins to the proteins in question. The scale-free character of both the 
metabolic and protein-protein interaction networks suggests the evolutionary 
selection of a common large scale structure of biological networks. Studies of 
other biological networks are expected to provide further evidence for this idea. 

3 Nonlinear dynamics 

The dynamics of gene regulatory networks are described by coupled nonlinear 
p.d.e.'s which can be collectively represented as 

where X(t) is the N -component state vector (X\(t), -Xjv(i)) and / is a set of 
nonlinear functions fi(X, R), /jv(X, R) . There are thus N coupled p.d.e.'s 
and an individual p.d.e. is of the form 

^- = M X U ....X N ,R) (5) 

There are in total N species of biochemical molecules participating in M re- 
actions. Xi(t) (i — 1,...,N) represents the concentration of the ith molecular 
species at time t . R represents a set of control parameters. The functions f-s 
are nonlinear functions of the X[s and the specific forms of the functions are 
determined by the structures and rate constants of the M chemical reactions. 
As an example consider the set of reactions 

P -> A (6) 
A -> B (7) 
A + 2B -> 3B (8) 

B -> C (9) 

The reactions represent the conversion of the precursor species P into a final 
product C via a sequence of four reactions involving two intermediates A and 
B . The third reaction is autocatalytic as B catalyses its own production. The 
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second reaction represents the uncatalysed conversion of A to B and the last 
reaction shows that the catalyst B decays into the product C . The equations 
are assumed to be irreversible. Also, the concentration of the reactant P is 
assumed to be constant over a reasonable period of time. This is possible if 
the initial concentration of P is large. Let po (constant), a and b denote the 
concentrations of the molecular species P , A and B . The decay product C does 
not participate in any further reaction and so does not influence the chemical 
kinetics. The rates of the four successive chemical reactions are koPo , k u a , 
k c ab 2 and kjb respectively where ko , k u , k c , kd are the rate constants. The 
equations governing the chemical kinetics are 

^ = k p - k c ab 2 - k u a (10) 

^ = k u a + k c ab 2 - k d b (11) 
at 

In the general scheme of p.d.e.'s shown in Eq. (5), N = 2, i.e., there are two 
molecular species A and B participating in M = 4 chemical reactions. X\ = a 
and X2 = b are the concentrations of the moleculeas A and B. The r.h.s.'s 
of Eqs. (10) and (11) are the nonlinear functions fi{Xi,X2) and f2(X\, X2), 
the nonlinearity arising from the autocatalytic term k c ab 2 . The rate constants 
together with po constitute the control parameters R. 

In the general case, imagine an abstract N - dimensional state space with 
axes X\,....,Xn ■ The state of the system at any instant of time, say to, is 
given by the N -component state vector X (to). In the state space, this state 
is represented by a single point. The time evolution of the system gives rise 
to a trajectory in the state space. The trajectory may end up at a fixed point 
X*. At this point, the rates of change of all the variables in the system are 
exactly zero, i.e., the l.h.s.'s of the N equations in Eq.(5) are zero. The system 
is said to be in the steady state at the fixed point. At this point, the state of 
the system remains unchanged as a function of time. The only way of changing 
the state of the system is to apply perturbations to it. A fixed point is stable 
if small perturbations around the point eventually damp out. The stable fixed 
point acts as an attractor to the states in its vicinity. The corresponding region 
in the state space is called the basin of attraction. The nonlinear dynamics may 
give rise to more than one fixed point. If there are two stable fixed points, the 
system is bistable, i.e., two stable steady states are possible. One can similarly 
define multistability. 

The other long-term possibilities for the trajectory in the state space are a 
limit cycle and a strange attractor. In the first case, the trajectory goes towards 
a closed loop and eventually circulates around it forever. In physical terms, this 
corresponds to stable oscillations in the system. The strange attractor is a set 
of states to which the trajectory is confined, never stopping or repeating. Such 
aperiodic motion is often indicative of chaos in the system. We now discuss the 
role of the control parameters R (Eqs. (4) and (5)) in the nonlinear dynamics 
of a system. By varying these parameters, one can bring about changes in the 
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qualitative structure of the dynamics. Such changes are known as bifurcations. 
For example, as a parameter is changed, a steady state can become unstable and 
replaced by stable oscillations. A system with one stable steady state changes 
over to multistability, i.e., the system can exist in multiple steady states. To 
give a simple example of bifurcation, consider the rate equation 

dx n ,„„. 

dt=» X - X (12) 
There are two fixed points of this equation: x* = and x* — fi. To determine 
the stability of the fixed points, one undertakes what is known as the linear 
stability analysis. One determines the time evolution of a small perturbation 
Sx(t)(= x(t) — x*) around the fixed point. By substituting x(t) — x* + Sx(t) 
in Eq.(12) and ignoring terms of the order of (5x(t)) 2 , one obtains 8x{t) ~ e Mt 
when x* = 0. The fixed point is stable if /i < since 5x(t) reduces to zero during 
time evolution. The fixed point is unstable if (X > and [i c = is the bifurcation 
point. If x* = /x, then 5x(t) ~ e _Alt . Hence the fixed point is unstable if [i 
< and stable for \x > 0. Different types of bifurcation are possible a detailed 
discussion of which is given in standard textbooks and reviews j2?| [29) on 
nonlinear dynamics. 

If there is more than one stable fixed point, a switch-like behaviour is pos- 
sible. In the case of bistability, the system remains in one stable state until a 
sufficiently large perturbation drives the system to the other stable state. The 
system continues to remain in the latter state even after the perturbation is re- 
moved. The A-phage lysis-lysogeny network offers an example of bistability 
The lytic and the lysogenic states are the two possible steady states. A transi- 
tion from the lysogenic to the lytic state occurs on irradiating with ultra-violet 
light. In a gene regulatory network, a negative (positive) feedback implies that a 
gene product inhibits (promotes) its own level of activity. To give an example, 
a protein which represses the transcription of its own gene operates through 
negative feedback. It has been found that negative (positive) feedback increases 
(decreases) stability in gene regulatory systems Real life gene regulatory 
networks are often complex. Some of the examples are the A-phage lysis-lysogeny 
circuit, the regulatory network for the activation of the tumour-suppressor pro- 
tein p53 pi] a nd the bacteriophage T7 (another lytic phage which infects E.coli 
) network |]32[|. Computational modelling studies of these networks have been 
undertaken with a view to explain experimental results. The quantitative agree- 
ment between theory and experiment is most often not good. The reasons are 
two fold: the complex nature of the networks and the difficulty in carrying out 
actual experiments on them. Computational as well as mathematical modelling 
of simpler networks is more extensive. Such networks incorporate the essential 
features of their more complex counterparts. The models seek to explain ex- 
perimental results at a qualitative level. There are also abstract mathematical 
models of gene expression / regulation which highlight the general principles and 
their outcomes. There are already some good reviews and books on the com- 
putational and mathematical modelling of gene regulatory networks J|, |33|, Q . 
For the purpose of this review, we pick on just one example, that of a synthetic 
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gene regulatory network which illustrates the importance of nonlinearity in the 
dynamics of the network. 

Gardner et al have constructed and tested a synthetic, bistable gene 
regulatory network based on the predictions of a simple mathematical model. 
The network is called a genetic toggle switch and consists of two repressors 
(proteins) and two promoters. The enzyme RNA polymerase (RNAP) binds to 
the promoter region of a DNA sequence to initiate the process of transcription. 
The initial binding of RNAP to a promoter can be prevented by the binding of a 
regulatory protein to an overlapping segment of DNA, called operator. The gene 
expression is off in this case. Fig. 2 shows a simple sketch of the toggle network. 
The two promoters are designated as Pl and Ptrc ~2.Pl drives the expression 
of the lad gene and Ptrc — 2 that of the cl gene. The lad and cl genes express 
the proteins of the same names. The proteins mutually inhibit the production 
of each other, hence the name repressor. The lad proteins form tetramers and 
the tetramer binds to operator sites adjacent to the Ptrc — 2 promoter, blocking 
the transcription of the cl gene in the process. The cl proteins, when produced, 
form dimers. The repressor dimer cooperatively binds to the operator sites in 
the vicinity of the Pl promoter. As a result, transcription of the lad gene is 
not possible. 

The nonlinear dynamics of the toggle network are governed by the following 
two equations: 



dU ct\ 



dt 1 + 
dV ct2 



U (13) 
V (14) 



dt 1 + W 

where U and V are the concentrations of lad and cl proteins respectively, a\ 
and a2 are the effective rates of synthesis of lad and cl proteins, (i is the 
cooperativity of repression of the Pl promoter and 7 the same in the case of 
the Ptrc — 2 promoter. Fig. 3 reveals the origin of bistability in the system. 
The nullclines ^ = and ^ = intersect at three points. These are the 
fixed points (steady states) of the dynamics. Two of the fixed points are stable 
and the third unstable. The bistability occurs provided (3, 7 > 1 (cooperative 
repression of transcription) and the rates of synthesis of the two repressors are 
balanced. If the rates are not balanced, the nullclines intersect at a single point 
giving rise to a single stable steady state (monostability) . 

In the region of bistability, the two stable steady states correspond to (1) 
State 1 (high V / low U ) and (2) State 2 ( low V / high U ) respectively. There 
are two basins of attraction, one above the separatrix and the other below 
it. In the log{a\) vs. log(ct2) parameter space, bifurcation lines separate the 
monostable and bistable regions jll). The size of the bistable region decreases 
on reducing the cooperativity of repression ((3 and 7 ) . The parameters a\ , ct2 , (3 
and 7 act as the control parameter R changing which a transition (bifurcation) 
between monostability and bistability occurs. 
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In the region of bistability, the toggle is flipped between the stable states 
(States 1 and 2) using transient chemical or thermal induction. The chemical 
agent isopropyl-/3 -D-thiogalactopyranoside (IPTG) can bind to lacl tetramers. 
As a result, the latter cannot bind to the operator region in the neighbourhood 
of the promoter Ptrc — 2, i.e., lacl can no longer repress the production of the 
cl proteins. Suppose the bistable system is originally in the stable State 2 (high 
U ( lacl ), low V ( cl )). On the induction of IPTG, the concentration of the 
cl proteins increases as a function of time. The cl proteins in their turn repress 
the production of lacl proteins the concentration of which begins to fall. The 
dynamics ultimately leads the system to the other fixed point (State 1). The 
system remains in this stable steady state (low U / high V ) even after the 
removal of the IPTG stimulus. How can the toggle flip back to State 2 ? This 
is achieved by using a temperature-sensitive cl protein in the network. The 
degradation rate of this protein increases as temperature is raised. On raising 
the temperature to 42°C (actual experiment), the concentration of cl proteins 
starts to fall. Since repression is less, the concentration of lacl proteins starts 
to go up. 

The system finally reaches the fixed point corresponding to the stable steady 
State 2. After the steady state is reached, the temperature of the system is re- 
duced (32°C in the experiment). The system continues to remain in the steady 
State 2. A full cycle of the switching process is now completed. The actual con- 
struction of the toggle switch has been accomplished in E.coli using the standard 
tools of molecular biology p| . There is a reasonable agreement between the 
theoretical predictions based on Eqs.(13) and (14) and the results obtained from 
experiments on the synthetic toggle network. The design of the network relies 
significantly on theoretical inputs like identification of the region of bistability, 
increasing the cooperativity in repression ((3 and 7 ) to achieve bistability over 
a wider region in parameter space etc. As a practical device, the toggle switch 
may have applications in biotechnology, biocomputing and gene therapy. As a 
cellular memory unit, the toggle provides the basis for "genetic applets" which 
are self-contained, programmable synthetic gene networks used in the control of 
cell functions. In parallel with the toggle work, another synthetic network, the 
repressilator has been designed and tested |35|. The repressilator dynamics is 
again nonlinear and give rise to oscillations in the concentrations of the cellular 
proteins. The design of the network is based on a simple mathematical model of 
transcriptional regulation. The repressilator provides insight about the design 
principles of other oscillatory systems such as circadian clocks found in many 
organisms including cyanobacteria. The genetic toggle switch and the repres- 
silator demonstrate that theoretical models can provide the design criteria for 
the actual construction of synthetic, gene regulatory networks. These simple 
networks have applications as practical devices and also help us to understand 
the functional properties of the more complex, naturally-occurring networks. 

Nonlinear dynamics can give rise to various types of instability one of which 
is the Turing instability. In 1952, Turing |36| in a seminal paper proposed a 
mechanism for pattern formation in biological systems as well as the develop- 
ment of structure during the growth of an organism. Examples of biological 
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patterns are the spots on the skin of a leopard, the stripes of a zebra, the ar- 
rangement of veins on the leaves of a tree etc. Structure formation is initiated 
by the process of cell differentiation, an example is the emergence of limbs in 
an organism during the growth of the organism from the featureless embryonic 
stage. The Turing mechanism involves both reaction as well as diffusion pro- 
cesses. To illustrate the mechanism, consider two chemical agents, the activator 
and the inhibitor. The activator is autocatalytic, i.e., it promotes its own pro- 
duction as well as that of the inhibitor. The inhibitor, as the name implies, is 
antagonistic to the activator and represses its production. Both the chemicals 
can diffuse but the inhibitor has a much larger diffusion coefficient. Consider 
a homogeneous distribution of the activator and the inhibitor in the system. 
Increase the concentration of the activator by a small amount in a local region. 
This gives rise to further increases in the local concentrations of the activator 
and the inhibitor. The inhibitor quickly diffuses to the surrounding region and 
prevents the activator from reaching there. Thus, in the steady state, islands 
of high activator concentration exist in a sea of high inhibitor concentration. 
The islands constitute what is known as the Turing pattern. Diffusion in gen- 
eral smooths out concentration differences in a system but the Turing process 
involving both reaction and diffusion gives rise to a steady pattern of concentra- 
tion gradients. There is now increasing evidence that chemical gradients play 
a crucial role in the formation of patterns and cell differentiation in biologi- 
cal systems. To give an example, the protein bicoid has been found to have a 
graded concentration distribution in the Drosophila melanogaster embryo. It 
is responsible for the organization of the anterior half of the fly and has been 
fully characterised Many reaction-diffusion (RD) models have been 

proposed based on the Turing mechanism and some of these can reproduce the 
patterns observed in nature p9| , p0| , ^j] , The basic scale of a pattern is 

larger than the size of an individual cell and so the RD processes involve more 
than one cell. Cells possibly choose developmental pathways depending upon 
their location in the concentration gradient. Position-dependent activation of 
genetic swtches in the cells may constitute an important step in both pattern 
and structure formation. Direct evidence for this in terms of a detailed charac- 
terization of the genes involved and an identification of the actual biochemical 
reactions occurring in the cells, is, however, yet to be obtained. Turing patterns 
have so far been experimentally observed in certain chemical RD systems in the 
laboratory Q and also in some biological systems JI^, . 



4 Effect of stochasticity 

As already mentioned in the Introduction, stochastic fluctuations in the dy- 
namics of the gene regulatory network lead to a probabilistic selection of devel- 
opmental pathways. The A-phage lysis-lysogeny network was discussed as 
an example. Fig. 4 shows some of the key components of the network. The 
complexity of the full network is captured in Figure 1 of Ref. Jig ]. We confine 
our attention to the simpler network. It consists of two A -phage genes cl and 
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cro. The corresponding promoters are Prm and Pr respectively. Transcription 
of the gene cl ( cro ) expresses the regulatory protein A repressor ( Cro ). Both 
the proteins are capable of binding to the operator regions OrI , Or2 and Or3. 
They act antagonistically to control promoter activity. Transcription of the cl 
gene, initiated from the promoter Prm , takes place whenever there is no pro- 
tein of either type binding to 0^3 . The A repressor molecule has a dumbell 
shape and there is a tendency for two such molecules to bind and form a dimer. 
The operator region OrI has the highest affinity for the binding of A repressor 
dimer. The binding increases the affinity of Or2 for a second repressor dimer, 
i.e., a cooperative binding of dimers to the operator regions OrI and Or2 takes 
place. The A repressor has both negative and positive control. If the A repressor 
is present at Or2 , transcription of the cro gene is not possible. This is because 
the repressor covers part of the DNA that a RNAP molecule must have access 
to in order to recognize the promoter Pr , bind to it and initiate the transcrip- 
tion of the cro gene. The same repressor at Or2 exhibits positive control in 
helping a RNAP molecule to bind to the promoter Prm and begin transcrip- 
tion of the cl gene. The increase in the transcription rate is approximately 
tenfold |4^|. If Or2 is not occupied by the repressor, the transcription rate of 
the cl gene is low. The reason for the dramatic increase in the transcription rate 
is the following. The presence of a repressor dimer bound to Or2 leads to an 
increased affinity of Prm for RNAP because the polymerase is held at Prm not 
only by its contacts with the DNA but also due to the protein-protein contact 
with the repressor. In summary, a repressor dimer bound to Or2 , represses 
transcription from Pr but promotes transcription at Prm ■ 

The cro gene is transcribed only when the operator region 0^3 is either 
empty or has Cro dimer bound to it. The transcription of the cl gene cannot 
take place if the OrI and Or2 regions are occupied by either protein, A repres- 
sor and Cro. In the lysogenic state, all the phage genes are off except for one 
gene cl which produces the protein A repressor. The protein in turn binds to 
the operators OrI and Or2 in the form of dimers and activates the transcrip- 
tion of its own gene at Prm- The bound A repressor dimers further prevent 
transcription initiation at Pr . Irradiation of the lysogen with ultra-violet light 
inactivates A repressor making the synthesis of the second regulatory protein 
Cro possible. Cro promotes lytic growth and competes with the A repressor in 
occupying the same operator sites. Increased Cro production leads to a greater 
probability of Cro binding at 0^3 which prevents the initiation of transcription 
at Prm ■ The concentration of the A repressor starts to fall down as a result. 
The concentration of Cro proteins increases and when it reaches a level such 
that the operator regions OrI and Or2 begin to be occupied, the transcription 
at Pr is also halted. The switchover from the lysogenic to the lytic state is 
further possible by recA -mediated degradation of the A repressor ( recA is a 
catalytic protein ). 

Arkin et al have analysed the stochastic kinetics of the full A-phage 
network which consists of more genes and regulatory elements than shown in 
Figure 4. Their detailed investigations show that fluctuations in the rates of 
gene expression give rise to random patterns of protein production in individ- 
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ual cells and wide diversity in instantaneous protein concentrations across cell 
populations. Each cell has two developmental pathways: lytic and lysogenic. 
The pathway selection depends upon which protein, A repressor or Cro, takes 
control of the operator region. If it is the A repressor, the lysogenic pathway is 
chosen. If the Cro takes control, the lytic pathway is selected. Due to stochastic 
fluctuations, the concentrations of A repressor and Cro vary considerably from 
cell to cell tipping the balance in favour of one or the other pathway. As a result, 
initially homogeneous cell populations can partition randomly into distinct lytic 
and lysogenic subpopulations. Arkin et al have constructed a stochastic kinetic 
model of the A-phage circuit and based on model calculations predicted the 
fraction of infected cells selecting the lysogenic pathway at different phage:cell 
ratios. The theoretical results are consistent with the experimental results of 
Kourilsk y |4g}|. The kinetic model uses the stochastic formulation of chemical 
kinetics Jlj , |l5| , stochastic mechanisms of gene expression [|l2| and a statistical- 
thermodynamical model of promoter regulation [Q. Probabilistic selection of 
developmental pathways occurs in several other gene regulatory networks pro- 
ducing stochastic phenotypic outcomes. Some examples are given in Table 4 of 
Ref. @. 

We now describe the well-known Gillespie algorithm |l5| which is in- 
cresingly being used by biologists in the stochastic kinetic approach to the study 
of gene expression and regulation in different systems Let us consider a system 
of N chemicals participating in M reactions R^. The state of the system at 
any instant of time t is represented as {X\, ...,Xn) where Xi is the number of 
molecules of the ith chemical species. Two questions have to be answered to 
determine how the system evolves in time: (1) when will the next reaction occur 
and (2) what type of reaction will it be ? Let 

C^dt = the probability that an (p = 1,...,M) reaction occurs in the 
next infinitesimal time interval dt for a particular combination of the reactant 
molecules. Let be the number of distinct combinations of molecules available 
in the state (Xl, ...Xn) for the R^ reaction. 
As an example, consider the reaction 



A + B^C (15) 

Let Xi and Xi be the number of molecules of types A and B respectively. Then 
h = X X X 2 . Let 

a^dt = h^C^dt be the probability that an i? M reaction occurs in time (t, t+dt) 
given the system is in the state (X\, ...,X^) at time t. 

The reaction probability density function P{t, /j,)dr is the probability that given 
the state (X%, Xn) at time t, the next reaction will occur in the infinitesimal 
time interval (t + r, t + t + dr) and will be an ii M reaction, 

P(T,(j)dT =P (T)a^dT (16) 

where Po(t) is the probability that no reaction occurs in the time interval (t, t + 
t) and a^dr is the subsequent probability that an i? M reaction occurs in the 
time interval (t + r, t + r + dr). Now 
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Po(T + dr) =P (t) 



M 



1 — a v dr 



(17) 



where the expression inside the bracket is the probability that no reaction occurs 
in time dr from the state (Xi, Xx). Eq. (17) can be solved to obtain 



Pq(t) = exp 



M 



Substituting for Pq(t) in Eq. (16), one gets 



if < r < oo, fj, = 1, 



and 



P(t, h) = a^exp (-a r) 
, M and P(t, p,) = otherwise. 

a M = /i/iC^, (/i = 1, ...,M) 



A I 



a 



(18) 

(19) 
(20) 

(21) 



Now the goal is to generate a pair of random numbers (t, jj) acording to the 
probability distribution (19). To do this, use the standard random number 
generator to obtain two random numbers from the uniform distribution in the 
unit interval. Take 



1/ 

ao 



(22) 



(23) 



and \i is chosen to be the integer for which 

The pair of numbers (r, /i), (Eqs. (22) and (23)), belongs to the set of random 
pairs described by the probability density function P(t, /i). For a rigorous proof 
of this see Refs. JTJ, |l^]. Once (r,/i) are known, put 

t = t + r (24) 

and adjust the Xi values according to the reaction. If the reaction is the 
one shown in Eq.(15), both Xi and X 2 have to be decreased by 1 and X 3 , the 
number of molecules of C, increased by 1. 

The input values at time t = are h v , C v (v — 1, ...M) and the initial values 
of Xi(i — 1, N). The steps of the Gillespie algorithm are: 

Step 1 

Calculate a v = h v C v {v = 1, M) and ao = Ylu=i av - 
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Step 2 

Generate r\ and r^ with the help of a uniform random number generator. Cal- 
culate t and fi according to the formulae in Eqs. (22) and (23). 
Step 3 

Advance t by r (Eq.(24)) and adjust the Xi values according to R^. Then repeat 
the steps from Step 1 to further advance the system in time. 

Recently, Kierzek et |5^| al have used the Gillespie algorithm to study a 
stochastic kinetic model of prokaryotic gene expression. They explicitly consid- 
ered ten biochemical reactions: 

P + RNAP — ► P-RNAP (25) 
P-RNAP -> P + RNAP (26) 
P-RNAP -> TrRNAP (27) 

TrRNAP -> RBS + P + EIRNAP (28) 

where P denotes the promoter region of the gene and P_RNAP the bound 
promoter-RNAP complex. Reaction 2 (Eq.(26)) describes RNAP dissocia- 
tion and Reaction 3 the isomerization of "closed complex" to "open complex", 
TrRNAP is the activated RNAP— promoter complex. Reaction 4 describes 
clearance of promoter region by RNAP, EIRNAP stands for RNAP transcrib- 
ing the gene and synthesizing the mRNA molecule and RBS is the ribosome 



binding site on mRNA. The other reactions are: 

Ribosome + RBS -> RibRBS (29) 

RibRBS -» RBS + Ribosome (30) 

RibRBS -> EIRib + RBS (31) 

RBS -> decay (32) 

EIRib -> protein (33) 

Protein — > decay (34) 



Reactions 5-10 (Eqs. (29)-(34)) describe translation, mRNA decay and pro- 
tein degradation. Reaction 5 describes Ribosome binding to .RBS , the bound 
complex is designated as RibRBS . Reaction 6 is the dissociation of the bound 
complex. Reaction 7 describes Ribosome binding site clearance, EIRib is the 
Ribosome which translates the mRNA. Reaction 8 has degradation of RBS by 
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the enzyme RNAaseE. RNAaseE and Ribosomes are in competition to oc- 
cupy RBS. If RNAaseE binds first then it initiates the degradation of mRNA 
but does not interfere with the movement of the already bound Ribosomes en- 
gaged in the process of translation. Every Ribosome which successfully binds 
to the RBS completes translation of the protein. Reaction 9 corresponds to 
the completion of protein synthesis. Reaction 10 represents protein decay. The 
stochastic rate constants C'^s of the different reactions, needed as inputs to the 
Gillespie algorithm, can be calculated from the more familiar chemical rate con- 
stants listed in Kierzek et al's paper (5(J. For first order chemical reactions, the 
stochastic rate constant is equal to the rate constant of a chemical reaction. For 
second order reactions, the stochastic rate constant is equal to the rate constant 
divided by the volume of the system (in this case a cell) . 

In the last part of this Section, we describe a cooperative stochastic model of 
gene expression proposed by us ^o|. As already explained in the Introduction, 
the model has been constructed to explain the bimodal distribution in gene ex- 
pression observed in recent experiments. The model describes the transcription 
of a single gene with one promoter region. There is one operaor region to which 
a regulatory protein R can bind. This prevents the binding of a RNAP to 
the promoter so that transcription of the gene cannot be initiated. There is a 
finite probability that the bound R molecule dissociates from the operator at 
any instant of time. RNAP molecule then has a certain probability of binding 
to the promoter and initiating transcription. 

Each of the possibilities described above actually involves a series of physico- 
chemical processes, a detailed characterization of which is not required for the 
model of gene expression proposed by us. We represent a gene by a one- 
dimensional lattice of n + 2 sites. The first two sites represent the operator 
and promoter respectively. The lattice is a coarse-grained description of an 
actual gene. In reality, the operator and promoter regions may extend over a 
certain number of base pairs in the DNA and they can be overlapping or not. 
In our model, they are represented as single sites. Each of the other sites in the 
lattice represents a finite number of base pairs in the DNA molecule. 

The different physico-chemical processes are lumped together into a few sim- 
ple events which are random in nature. This lumping together avoids unnec- 
essary complexity that has no bearing on the basic nature of the process. The 
operator (O) and the promoter (P) together can be in four possible configu- 
rations: 10,01,00 and 11. The numbers "1" and "0" stand for "occupied" and 
"unoccupied". The configuration ij describes the occupation status of O (i) 
and P (j). For example, the configuration 10 corresponds to O being occupied 
by a R molecule and P being unoccupied. Similarly, in the configuration 01, 
O is unoccupied and P is occupied by a RNAP molecule. Binding of R and 
RNAP molecules are mutually exclusive so that the configuration 11 is strictly 
prohibited. Given a 00 configuration at time t, the transition probabilities to 
configurations 10 and 01 at time t + 1 are p\ and pi respectively. The proba- 
bility of remaining in the configuration 00 is 1 — p\ — P2- A 10 configuration at 
time t goes to a 00 configuration at time t + 1 with probability p^ and remains 
unchanged with probability 1 — pa. We have assumed all the probabilities to be 
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time-independent. The RNAP molecule once bound to the promoter initiates 
transcription in the next time step, i.e., the 01 configuration makes a transition 
to a 00 configuration with probability 1. The motion of RNAP is in the for- 
ward direction and the molecule covers a unit distance (the distance between 
two successive lattice sites) in each time step. Once the molecule reaches the 
last site of the lattice, the transcription ends and a mRNA is synthesized. 

The second major feature of our model is the cooperative binding of RNAP 
to the promoter, when an adjacent RNAP molecule is present. This implies 
that there is a higher probability of binding of RNAP to the promoter in one 
time step if another RNAP molecule is present at the site next to the promoter. 
In our model, the probability of cooperative binding is p4 which is larger than 
P2- The probabilities p\ and 1 — pi — P2 are changed to new values p$ and 
1 — P4 — P5 respectively. Degradation of mRNA is taken into account by as- 
suming the decay rate to be given by (mN, where N is the number of mRNAs 
present at time t. The number of mRNAs produced as a function of time is 
studied by Monte Carlo simulation. For the sake of simplicity, we have not 
tried to simulate protein levels or enzymatic products thereof, i.e., we study 
gene expression upto the level of transcription (mRNA synthesis). Since the 
number of protein molecules and converted products should be proportional to 
the number of mRNA molecules, no loss of generality is introduced by this 
simplification. The lattice consists of 52 sites (n = 50). Stochastic events are 
simulated with the help of a random number generator. The updating rule of 
our cellular automaton (CA) model is that in each time step t, the occupation 
status ( or 1) of each site (except for the O site) at time £ — 1 is transferred 
to the nearest-neighbour site towards the right. If the last site is 1 at t — 1, a 
mRNA is synthesized at t and the number of mRNAs increases by one. In the 
same time step, the configuration ij of OP is determined with the probabilities 
already specified. Thus, in each time step, the RNAP molecule, if present on 
the gene, moves forward by unit lattice distance (progression of transcription) 
followed by the updating of the OP configuration. Figure 5 shows the concen- 
tration [mRNA] of mRNA molecules in the cell as a function of time for the 
parameter values p\ = 0.5, P2 = 0.5, ps = 0.3, p^ = 0.85, p§ = 0.05 and n = 0.4. 
Note that an almost four-fold increase in the probability of RNAP binding is 
assumed due to cooperativity. The stochastic nature of the gene expression is 
evident from the figure, with random intervals between the bursts of activity. 
One also notices the presence of several bursts of large size. It is important 
to emphasize that the frequency of transition between high and low expression 
levels is a function of the parameter values chosen and may be low for certain 
parameter values. For the probability values considered, the two predominantly 
favourable states are when the gene expression is off (state 1) and when a large 
amount of gene expression takes place (state 2). In the absence of RNAP bind- 
ing, state 1 has greater weight but with the chance binding of RNAP to the 
promoter (probability pi for this is small), the weight shifts to state 2 until 
another stochastic event terminates cooperative binding and the gene reverts to 
state 1. The probability of obtaining a train of N successive transcribing RNAP 
molecules is p2P±~ 1 (l — Pa)- This is the geometric distribution function and the 
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mean and the variance of the distribution are given by and p re- 

spectively. For the probability values already specified, the simulation has been 
repeated for an ensemble of 3000 cells. For each cell, the time evolution is upto 
10000 time steps. Figure 6 shows the distribution of the number N(m) of cells 
versus the fraction m of the maximal number of mRNA molecules produced 
after 10000 time steps. Two distinct peaks are seen corresponding to zero and 
maximal gene expression. Such a bimodal distribution occurs over a range of 
parameter values. 

Some theories have been proposed so far to explain the so-called "all or none" 
phenomenon in gene expression [|l9|, BlJ |5^]. These theories are mostly based 
on an autocatalytic feedback mechanism, synthesis of the gene product gives 
rise to the transport or production of an activator molecule. While such pro- 
cesses are certainly possible, the bimodal distribution is a much more general 
phenomenon and has now been found in many types of cells, from bacterial 
to eukaryotic and for different types of promoters 0, [l?], The two major 
features of the model of gene expression that we have proposed are stochasticity 
and cooperative binding of RNAP. There is by now enough experimental evi- 
dence of stochasticity in gene expression. Our suggestion of cooperative binding 
of RNAP is novel and there is no direct experimental verification of the pro- 
posal as yet. There are some recent experiments which provide indirect evidence 
and these are discussed in Ref. |pp|| . 

5 Concluding remarks 

In this review we have given an elementary introduction to some of the major 
aspects of biological networks, namely, topological characteristics, nonlinear dy- 
namics and the role of stochasticity in gene expression and regulation. The main 
aim of the review is to highlight the usefulness of interdisciplinary approaches 
in the study of both natural and synthetic biological networks. Some important 
features of such networks have not been discussed in the review. One of these is 
the operational reliability of networks in spite of randomness in basic regulatory 
mechanisms. Many regulatory pathways do have highly predictable outcomes 
even when stochastic fluctuations are considerable. Cells adopt various strate- 
gies like populational transcriptional cooperation, checkpoints to ensure that 
cascaded events are appropriately synchronised, redundancy and feedback to 
achieve regulatory determinism. Some of these ideas are discussed in Ref. ||lo| . 
The complexity of biological networks raises the question of their functional 
stability. In particular, the issue of interest is the sensitivity of networks to 
variations in their biochemical parameters. Barkai and Leibler |^3| have stud- 
ied a biochemical network responsible for bacterial chemotaxis and shown that 
the functional properties of the network are robust, i.e., relatively insensitive 
to changes in biochemical parameters like reaction rate constants and enzy- 
matic concentrations. Bialek (54) has shown that extremely stable biochemical 
switches can be constructed from small numbers of molecules though intuitively 
one expects such systems to be prone to instability due to the inherent noise. 
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Metzler |55|] in a recent paper has shown that spatial fluctuations in the 
distribution of regulatory molecules play a non-trivial role in genetic switching 
processes. Apart from internal stochastic fluctuations, external noise originat- 
ing in random variations in the environment or in the externally set control 
parameters, may affect the functioning of a biological network. Hasty et al 
have proposed a synthetic genetic network in which external noise is utilised to 
operate a protein switch (short noise pulses are used to turn protein production 
"on" and "off"). In another novel application, external noise is used to amplify 
gene expression, i.e., protein production by a considerable amount. 

Genetic networks with many components are difficult to analyze using con- 
ventional techniques. Many parallels have been drawn in the functioning of 



genetic and electrical circuits |57, |58|]. In electrical engineering, there are well 
developed techniques of circuit analysis which can be used to characterise the 
operation of complex electrical networks. Some of these techniques are increas- 
ingly being used to study genetic networks. Engineers are familiar with some 
of the design principles of biological networks. Rapid transitions between the 
two stable states of a system can be brought about by positive feedback loops. 
Negative feedback loops control the value of an output parameter to be within 
a narrow range even if there are wide fluctuations in the input. Coincidence 
detection systems activate an output provided two or more events occur simul- 
taneously. Parallel connections enable a device to remain functional in the event 
of failures in one of the lines. One can give analogous examples from biology. 
One set of positive feedback loops is responsible for the rapid transition of cells 
into mitosis (division of cell nucleus), another set brings about the exit from 
mitosis in an irreversible manner. Gene transcription in eukaryotes involve co- 
incidence detection. A mRNA can be produced only if the promoters regulating 
gene expression are occupied by the different transcription factors. These ex- 
amples indicate that general principles govern the functioning of genetic and 
electrical networks though there are other aspects of such networks which are 
not common to both. Biological networks constitute a field of research the in- 
terdisciplinary nature of which will become more evident as we progress into 
the twenty first century. 
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Figure Captions 

Figure 1. Example of a network. The solid circles and lines denote the nodes 
and the links respectively. 

Figure 2. Schematic diagram of the synthetic genetic toggle switch network. 
There are two promoters Pl and Ptrc — 2 and two genes lad and cl. 
Figure 3. Graphical representation of the toggle equations (Eqs. (13) and 
(14)). U, V are the concentrations of the lad and cl proteins respectively. 
There are two stable steady states: State 1 (high V / low U ) and State 2 
(low V I high U ) and one unstable steady state. 

Figure 4. Some key components of the A-phage lysis-lysogeny network: cl, cro 
are the two genes, Prm and Pr are the two promoters and OrI,Or2 and 
Ofl3 are the three operator regions. 

Figure 5. Concentration of mRNA molecules [mRNA] in arbitrary units as a 
function of time t. The parameter values arepi = 0.5, P2 — 0.5, p% = 0.3, — 
0.85, p 5 = 0.05 and fi = 0.4. 

Figure 6. Distribution of no. N(m) of cells expressing fraction m of maxi- 
mal number of mRNA molecules produced after 10000 time steps. The total 
number of cells is 3000. The parameter values are as in Figure 5. 
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